PSET: A Page Segmentation Evaluation Toolkit

Authors

  • Song Mao
  • Tapas Kanungo
Abstract

Empirical performance evaluation of page segmentation algorithms has become increasingly important because numerous algorithms are proposed each year. To choose among these algorithms for a specific domain, it is important to evaluate their performance empirically. To accomplish this task, the document image analysis community needs i) standardized document image datasets with groundtruth, ii) evaluation metrics that are agreed upon by researchers, and iii) freely available software for evaluating new algorithms and replicating other researchers’ results. In an earlier paper (SPIE Document Recognition and Retrieval 2000) we published evaluation results for various popular page segmentation algorithms using the University of Washington dataset. In this paper we describe the PSET evaluation package, which was used to evaluate the segmentation algorithms. The description of the package will allow researchers to understand the software better, replicate our results, evaluate new algorithms, and experiment with new metrics and datasets. The software is written in C on the SUN/UNIX platform and is being made available to researchers at no cost.

Similar resources

Software Architecture of PSET: A Page Segmentation Evaluation Toolkit

Empirical performance evaluation of page segmentation algorithms has become increasingly important due to the numerous algorithms that are being proposed each year. In order to choose between these algorithms for a specific domain it is important to empirically evaluate their performance. To accomplish this task the document image analysis community needs i) standardized document image datasets ...

Segmentation Evaluation

Empirical performance evaluation of page segmentation algorithms has become increasingly important due to the numerous algorithms that are being proposed each year. In order to choose between these algorithms for a specific domain it is important to empirically evaluate their performance. To accomplish this task the document image analysis community needs i) standardized document image datasets ...

Persian Printed Document Analysis and Page Segmentation

This paper presents a hybrid method, combining low-resolution and high-resolution analysis, for Persian page segmentation. In the low-resolution stage, a pyramidal image structure is constructed for multiscale analysis, and the document image is segmented into a set of regions. In the high-resolution stage, connected components analysis segments each region into homogeneous regions and identifyi...

Development and evaluation of large-enrollment, active-learning physical science curriculum

We report on the initial field tests of Learning Physical Science (LEPS), a new curriculum adapted from Physical Science and Everyday Thinking (PSET). PSET is an inquiry-based, hands-on, physical science curriculum that includes an explicit focus on nature of science and nature of learning. PSET was developed for small enrollment discussion/lab settings. The Learning Physical Science (LEPS) cur...

IFC-Compliant Design Information Modeling and Sharing

This paper presents a method for IFC-compliant design information modelling and sharing by use of the IFC technology and IFC property set (Pset) extension mechanism. The method comprises defining the nonstandard and project-specific design information in extensible and interoperable Psets, instantiating these Pset definitions with CAD object information and generating IFC-compliant product mode...

Publication date: 2000